WIM at TREC 2007

Authors

  • Jun Xu
  • Jing Yao
  • Jiaqian Zheng
  • Qi Sun
  • Junyu Niu
Abstract

This paper introduces the four tracks in which the WIM Lab at Fudan University took part at TREC 2007. For the spam track, a multi-centre model was proposed that takes the characteristics of spam mail into account, in contrast to the traditional two-class classification methodology; incremental clustering and closeness-based classification methods were applied this year. For the enterprise track, our research focused mainly on ranking functions for experts and on selecting the correct supporting documents for a given topic. For the legal track, we mainly evaluated the effects of a word distribution model in query expansion and of various corpus pre-processing methods. For the genomics track, three scoring methods were proposed to find the text snippets most relevant to a given topic. This paper gives an overview of the methods employed for each subtask and compares the results for each track.

Similar resources

Judging Expertise--WIM at Enterprise

This document reports the experiments and results of the Fudan WIM group in the Expert Search track at TREC 2006. Our research mainly focuses on the measurement of expertise. Inspired by the human procedure of expert search, we construct two models, a language model and a social network model, and compare them. The association function between expert and document is modified. Email search techniques are also employed in do...

WIM at TREC 2005

This paper describes the three TREC tasks we participated in this year: the Genomics track’s categorization and ad hoc tasks, and the Enterprise track’s known-item search task. For the categorization task, we adopt a domain-specific term extraction method and an ontology-based method for feature selection. An SVM classifier and a Rocchio-based two-stage classifier were also used in this...

Overview of TREC 2007

The sixteenth Text REtrieval Conference, TREC 2007, was held at the National Institute of Standards and Technology (NIST) November 6–9, 2007. The conference was co-sponsored by NIST and the Intelligence Advanced Research Projects Activity (IARPA). TREC 2007 had 95 participating groups from 18 countries. Table 2 at the end of the paper lists the participating groups. TREC 2007 is the latest in a...

Using Profile Matching and Text Categorization for Answer Extraction in TREC Genomics

The TREC’06 genomics track focused on text mining and passage retrieval. The WIM lab participated in this year’s TREC genomics track. Our system consists of five parts: preprocessing, sentence generation, document retrieval, answer extraction and answer fusion. We developed two different methods, an automated profile-matching-based method and a text-categorization-based method, to do the text minin...

MultiText Legal Experiments at TREC 2008

Our TREC 2008 effort used fusion IR methods identical to those used for our TREC 2007 effort; in addition we used logistic regression to attempt to learn the optimal K value for the primary F1@K measure introduced at TREC 2008. We used the Wumpus search engine combining several methods that have proven successful, including cover density ranking and Okapi BM25 ranking, and combination methods. St...

Publication date: 2007